Large vocabulary natural language speech recognition in software
Authors
Abstract
This presentation describes and reports on Dragon Systems' large-vocabulary, natural-language, speaker-dependent isolated-word recognition systems. Based on stochastic processing, these systems are implemented primarily in software running on a personal computer (PC) or workstation. The only processor used in addition to the host PC microprocessor is a simple 2 MHz 8-bit processor that assists in the real-time acquisition of the speech signal. Additional components on a sparsely populated audio board include an A/D converter or CODEC with some TTL logic for dynamic range/gain control. A brief chronology of test performance is reported. All speech training and test data were recorded in typical office environments, using an inexpensive hand-held omnidirectional cassette-recorder-type mike.

Although these tasks can be processed both off-line and on-line, the on-line facility provides a graceful interactive user interface. If more than one word is deemed probable by the recognition process, a short rank-ordered "top choice" menu is offered to the user in-line. If the top word is correct, confirmation is implicit by continued speech or delay. If the correct word is ranked second or lower, the user may quickly select it, with a single keystroke for example, and continue. New words or phrases can easily be introduced automatically in a few seconds during dictation. Full voice console and dictionary facilities run concurrently for immediate user access. An application interface enables most MS-DOS software, including popular word processors, spreadsheets, and database programs, to accept both voice and keyboard entries flexibly.

PERFORMANCE TEST SUMMARY

| Task | Date | Speaker gender | Active vocab. | Perplexity | Concurrent language model | PC processor | Response time (avg.) | Total test words | % correct | % in top 5 |
|---|---|---|---|---|---|---|---|---|---|---|
| General Prose Dictation | 7/84 | F | 1277 | 1277 | No | 4.5 MHz 8088 | 22 sec | 268 | 87.7% | 100% |
| Engineering Documentation Dictation | 11/85 | M | 2000 | not calculated | Yes | 6 MHz 80286 | 1.5 sec | 3050 | 92.9% | 97.9% |
| Natural Language Data Base Inquiries | 9/86 | F | 868 | 868 | No | 8 MHz 80286 | near real time | 846 | 92.4% | 98.6% |

Dragon Systems, Inc., Chapel Bridge Park, 90 Bridge Street, Newton, MA 02158, U.S.A.; Tel. (617) 965-5200, Fax (617) 527-0372
Copyright © Dragon Systems, Inc. 1987
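The interactive correction loop described in the abstract (a rank-ordered "top choice" menu, implicit confirmation of the top word, and single-keystroke selection of a lower-ranked word) can be illustrated with a minimal Python sketch. This is not Dragon Systems' implementation: the `Candidate` type, the score values, the `threshold` cutoff, and the function names `top_choice_menu` and `resolve` are all hypothetical, and the five-item menu limit is only an assumption suggested by the "top 5" column of the test summary.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    word: str
    score: float  # hypothetical recognizer score; higher means more probable


def top_choice_menu(candidates: List[Candidate],
                    threshold: float = 0.1,
                    max_items: int = 5) -> List[str]:
    """Return the probable words, best first, limited to max_items entries.

    The threshold and max_items values are illustrative assumptions.
    """
    probable = [c for c in candidates if c.score >= threshold]
    probable.sort(key=lambda c: c.score, reverse=True)
    return [c.word for c in probable[:max_items]]


def resolve(menu: List[str], keystroke: Optional[str] = None) -> str:
    """Pick the final word from the menu.

    keystroke=None models implicit confirmation (the user keeps talking or
    pauses), so the top-ranked word is accepted. A digit such as "2" models
    a single keystroke that selects the word at that rank.
    """
    if not menu:
        raise ValueError("menu is empty")
    if keystroke is None:
        return menu[0]
    index = int(keystroke) - 1
    if not 0 <= index < len(menu):
        raise ValueError("keystroke does not match a menu entry")
    return menu[index]


if __name__ == "__main__":
    heard = [Candidate("right", 0.30), Candidate("write", 0.55), Candidate("rite", 0.05)]
    menu = top_choice_menu(heard)
    print(menu)                # rank-ordered menu shown in-line
    print(resolve(menu))       # implicit confirmation of the top word
    print(resolve(menu, "2"))  # user selects the second-ranked word
```

Keeping menu construction separate from resolution mirrors the abstract's description, in which the menu only appears when more than one word is probable and the user's next action (speech, delay, or a keystroke) settles the choice.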
Similar resources
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
Extension of hidden markov model for recognizing large vocabulary of sign language
Computers still have a long way to go before they can interact with users in a truly natural fashion. From a user’s perspective, the most natural way to interact with a computer would be through a speech and gesture interface. Although speech recognition has made significant advances in the past ten years, gesture recognition has been lagging behind. Sign Languages (SL) are the most accomplishe...
CSLM - a modular open-source continuous space language modeling toolkit
Language models play a very important role in many natural language processing applications, in particular large vocabulary speech recognition and statistical machine translation. For a long time, back-off n-gram language models were considered the state of the art when large amounts of training data are available. Recently, so-called continuous space methods or neural network language models...
Speech recognition with automatic punctuation
We present a method of speech recognition with automatic punctuation based on a combination of acoustic and lexical evidence. In the recognizer vocabulary, punctuation marks are treated as word entries. By assigning the acoustic baseforms of silence, breath, and other non-speech sounds to punctuation marks, and using a properly processed N-gram language model, unpronounced punctuation marks of ...
A large vocabulary continuous speech recognition hybrid system for the portuguese language
Due to the enormous development of large vocabulary, speaker-independent continuous speech recognition systems, which has occurred essentially for US English, there is a large demand for this kind of system for other languages. In this paper we present the work done in the development of a large vocabulary, speaker-independent continuous speech recognition hybrid system for the European P...
The IBM conversational telephony system for financial applications
We describe our development work on a telephone-based conversational system in the domain of mutual fund transactions. This system uses several components, including robust large vocabulary continuous speech recognition, natural language understanding, dialog management, and text-to-speech synthesis technologies.